Geocoding location expressions in Twitter messages: A preference learning method

Authors

  • Wei Zhang
  • Judith Gelernter
Abstract

Resolving location expressions in text to the correct physical location, also known as geocoding or grounding, is complicated by the fact that so many places around the world share the same name. Correct resolution is made even more difficult when there is little context to determine which place is intended, as in a 140-character Twitter message, or when location cues from different sources conflict, as may be the case among different metadata fields of a Twitter message. We used supervised machine learning to weigh the different fields of the Twitter message and the features of a world gazetteer to create a model that will prefer the correct gazetteer candidate to resolve the extracted expression. We evaluated our model using the F1 measure and compared it to similar algorithms. Our method achieved results higher than state-of-the-art competitors.
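The idea of learning to prefer the correct gazetteer candidate can be sketched as pairwise preference learning. The sketch below is only an illustration under stated assumptions: the three features, their made-up values, the function names, and the simple perceptron update are all hypothetical and are not taken from the paper, which does not specify its learner or feature set in this abstract.

```python
# Illustrative sketch only: the features, their values, and the perceptron
# learner are assumptions, not the model described in the paper.

def score(weights, features):
    """Linear score of one gazetteer candidate."""
    return sum(w * x for w, x in zip(weights, features))


def train_preferences(pairs, n_features, epochs=20, lr=1.0):
    """Learn weights so each preferred candidate outscores its rival.

    pairs: list of (preferred_features, other_features) tuples.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, other in pairs:
            diff = [p - o for p, o in zip(preferred, other)]
            # Update only when the pair is ranked wrongly (or tied).
            if score(weights, diff) <= 0:
                weights = [w + lr * d for w, d in zip(weights, diff)]
    return weights


def resolve(weights, candidates):
    """Return the name of the highest-scoring candidate."""
    return max(candidates, key=lambda c: score(weights, c[1]))[0]


# Hypothetical features per candidate (values are made up):
# [scaled population, matches user-profile location, matches time zone]
training_pairs = [
    ([0.59, 1.0, 1.0], [0.44, 0.0, 0.0]),  # metadata agrees with the larger place
    ([0.44, 1.0, 1.0], [0.59, 0.0, 0.0]),  # metadata agrees with the smaller place
    ([0.69, 0.0, 0.0], [0.30, 0.0, 0.0]),  # no metadata: larger place preferred
]
weights = train_preferences(training_pairs, n_features=3)

candidates = [
    ("Paris, France", [0.63, 0.0, 0.0]),
    ("Paris, Texas, USA", [0.44, 1.0, 1.0]),
]
best = resolve(weights, candidates)
print(best)
```

In this toy setup the metadata features end up with larger weights than population, so a smaller place that agrees with the tweet's metadata is preferred over a larger namesake.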

Similar resources

Carmen: A Twitter Geolocation System with Applications to Public Health

Public health applications using social media often require accurate, broad-coverage location information. However, the standard information provided by social media APIs, such as Twitter, covers a limited number of messages. This paper presents Carmen, a geolocation system that can determine structured location information for messages provided by the Twitter API. Our system utilizes geocoding ...

Using Multi-View Learning to Improve Detection of Investor Sentiments on Twitter

Stock-related messages on social media have several interesting properties regarding the sentiment analysis (SA) task. On the one hand, the analysis is particularly challenging, because of frequent typos, bad grammar, and idiosyncratic expressions specific to the domain and media. On the other hand, stock-related messages primarily refer to the state of specific entities companies and their sto...

Detecting Emergency Events and Geo-Location Awareness from Twitter Streams

The rapidly increasing number of messages on Twitter is quite interesting. Through Twitter streaming, this paper is capable of delivering tweets for any keyword or hashtag from clients all around the world in real time. However, semantic topic extraction and tracking of user-interested news events from messages on Twitter can be considered a challenging task. In this paper focused on detect...

Selecting Quality Twitter Content for Events

Social media sites such as Twitter contain large amounts of user contributed messages for a wide variety of real-world events. While some of these “event messages” might contain interesting and useful information (e.g., event time, location, participants, opinions), others might provide little value (e.g., using heavy slang, incomprehensible language) to people interested in learning about an e...

Recognizing Extended Spatiotemporal Expressions by Actively Trained Average Perceptron Ensembles

Precise geocoding and time normalization for text requires that location and time phrases be identified. Many state-of-the-art geoparsers and temporal parsers suffer from low recall. Categories commonly missed by parsers are: nouns used in a non-spatiotemporal sense, adjectival and adverbial phrases, prepositional phrases, and numerical phrases. We collected and annotated a data set by querying co...


Journal:
  • J. Spatial Information Science

Volume 9

Publication date: 2014